ABSTRACT


This paper deals with the estimation of the regression coefficients
when the data are sequences of Bernoulli random variables that form
Markov chains. The method used is an extension of the method in
Klotz's papers.


REGRESSION FOR MARKOV BERNOULLI RANDOM VARIABLES


1. INTRODUCTION. The problem studied was that of regression on Bernoulli
random variables in the case where some of the random variables were
dependent. The interest in this case arose from a problem of trying to
fit probability-of-hit curves to data generated by repeated missile
simulations performed at US Army Materiel Systems Analysis Activity using
tracking data from the Antitank Missile Test (ATMT). Hit/miss data were
generated one second apart. Because overlapping tracking data were used,
successive shots were dependent. This caused problems that seemed
insurmountable until the author became aware of Klotz's papers (1) (2).

These papers treat the estimation of the parameters of a sequence of
dependent Bernoulli random variables that satisfy the Markov chain
property. In the case of successive shots, the Markov chain assumption
seemed reasonable and was used to solve the problem. Klotz's technique
was extended to the regression problem.

2. PRELIMINARIES. In the generated data the following occurred: for
several different ranges, a number of gunners (the number was not the same
for all ranges) fired a sequence of shots (not all the same sequence
length). The shots were fired a second apart. Let $X_{ijR}$ be the result
of the ith shot of the jth gunner at range R. A hit caused $X_{ijR}$ to be
1 and a miss caused it to be 0. The notation that is now introduced is
that of Klotz but modified to the needs of the problem under
consideration. The first probability of hit is:

$$P(R) = \Pr[X_{ijR} = 1] = b_0 + b_1 R + b_2 R^2 \qquad \text{Eq 1}$$

which, as shown in the above equation, is taken to be a second degree
polynomial in R. Next, the probability of a hit given that the previous
shot was a hit is:


$$P_{11}(R) = \lambda(R) = \Pr[X_{ijR} = 1 \mid X_{i-1,jR} = 1] = a_0 + a_1 R + a_2 R^2 \qquad \text{Eq 2}$$

which is also taken as a second degree polynomial in R and which is the
lower right hand term in the transition matrix. Clearly, equations 1 and
2 hold only when the sequences are stationary, which was a reasonable
assumption for the problem considered. The remaining three terms of the
transition matrix are:


$$P_{01}(R) = 1 - \lambda(R) = \Pr[X_{ijR} = 0 \mid X_{i-1,jR} = 1] \qquad \text{Eq 3}$$

$$P_{10}(R) = \frac{P(R)\,[1 - \lambda(R)]}{1 - P(R)} = \Pr[X_{ijR} = 1 \mid X_{i-1,jR} = 0] \qquad \text{Eq 4}$$

$$P_{00}(R) = 1 - P_{10}(R) = \Pr[X_{ijR} = 0 \mid X_{i-1,jR} = 0] \qquad \text{Eq 5}$$
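As a numerical check on equations 2 through 5, the transition matrix can be assembled and tested for stationarity. The following Python sketch is illustrative only: the function name `transition_matrix`, the row and column ordering, and the values chosen for P and λ are assumptions made for this example and are not taken from the original program.

```python
import numpy as np

def transition_matrix(p, lam):
    """Rows index the previous shot (0 = miss, 1 = hit), columns the current shot."""
    if not (0.0 < p < 1.0 and 0.0 <= lam <= 1.0):
        raise ValueError("p must lie in (0, 1) and lam in [0, 1]")
    hit_after_miss = p * (1.0 - lam) / (1.0 - p)      # Eq 4
    if hit_after_miss > 1.0:
        raise ValueError("1 - 2p + lam*p must be non-negative")
    return np.array([
        [1.0 - hit_after_miss, hit_after_miss],       # previous miss: Eq 5, Eq 4
        [1.0 - lam, lam],                             # previous hit:  Eq 3, Eq 2
    ])

# Illustrative values only (not from the ATMT data).
p, lam = 0.6, 0.7
T = transition_matrix(p, lam)
marginal = np.array([1.0 - p, p])
print(T)
print(marginal @ T)   # equals (1 - p, p) when the chain is stationary
```

With this ordering λ occupies the lower right hand corner of the matrix, and multiplying the marginal vector (1 - P, P) by the matrix returns the same vector. That is the stationarity property on which equations 1 and 2 rely.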


3. LIKELIHOOD. Having the above machinery, the joint probability of the
data is:


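$$\begin{aligned}
\Pr[X_{ijR}\text{'s}] = \prod_{R} \prod_{j=1}^{N_R} \; & P(R)^{X_{1jR}} \, [1 - P(R)]^{1 - X_{1jR}} \\
& \times \prod_{i=2}^{n_{jR}} P_{11}(R)^{X_{ijR} X_{i-1,jR}} \, P_{10}(R)^{X_{ijR}(1 - X_{i-1,jR})} \\
& \qquad \times P_{01}(R)^{(1 - X_{ijR}) X_{i-1,jR}} \, P_{00}(R)^{(1 - X_{ijR})(1 - X_{i-1,jR})}
\end{aligned} \qquad \text{Eq 6}$$

where:

$N_R$ = number of gunners firing at range R

$n_{jR}$ = number of shots by the jth gunner at the Rth range.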
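Substituting $r_{jR}$, $S_{jR}$, and $t_{jR}$ as described in equations 8
through 10, equation 6 becomes:

$$\begin{aligned}
\Pr[X_{ijR}\text{'s}] = \prod_{R} \prod_{j=1}^{N_R} \; & \lambda(R)^{r_{jR}} \, [1 - \lambda(R)]^{2(S_{jR} - r_{jR}) - t_{jR}} \\
& \times [1 - 2P(R) + \lambda(R)P(R)]^{n_{jR} - 1 - 2S_{jR} + r_{jR} + t_{jR}} \\
& \times P(R)^{S_{jR} - r_{jR}} \, [1 - P(R)]^{-(n_{jR} - 2 - S_{jR} + t_{jR})}
\end{aligned} \qquad \text{Eq 7}$$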

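where:

$$r_{jR} = \sum_{i=2}^{n_{jR}} X_{i-1,jR} X_{ijR} \qquad \text{Eq 8}$$

$$S_{jR} = \sum_{i=1}^{n_{jR}} X_{ijR} \qquad \text{Eq 9}$$

$$t_{jR} = X_{1jR} + X_{n_{jR},jR} \qquad \text{Eq 10}$$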
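The statistics entering equation 7 can be computed from a single gunner's hit/miss sequence as in the following Python sketch. Only the form of r is legible in equation 8; the forms used here for S (total number of hits) and t (first outcome plus last outcome) are assumptions, chosen because they are the ones that make the exponents of equation 7 agree with the transition counts of equation 6. The function names and the shot sequence are made up for illustration.

```python
def sufficient_stats(x):
    """Return (n, r, s, t) for one gunner's 0/1 shot sequence x."""
    n = len(x)
    r = sum(x[i - 1] * x[i] for i in range(1, n))   # Eq 8: hit followed by hit
    s = sum(x)                                      # assumed: total number of hits
    t = x[0] + x[-1]                                # assumed: first plus last outcome
    return n, r, s, t

def transition_counts(x):
    """Count (previous, current) pairs directly, for comparison with Eq 7."""
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for prev, cur in zip(x, x[1:]):
        counts[(prev, cur)] += 1
    return counts

shots = [1, 1, 0, 1, 1, 1, 0, 0, 1]   # made-up sequence, 1 = hit, 0 = miss
n, r, s, t = sufficient_stats(shots)
c = transition_counts(shots)
print(n, r, s, t)                                  # 9 3 6 2
print(c[(1, 0)] + c[(0, 1)], 2 * (s - r) - t)      # 4 4
print(c[(0, 0)], n - 1 - 2 * s + r + t)            # 1 1
```

The last two printed lines show the counting identities behind equation 7: the number of hit-to-miss plus miss-to-hit transitions equals 2(S - r) - t, and the number of miss-to-miss transitions equals n - 1 - 2S + r + t.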


Then the log likelihood function is:


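$$\begin{aligned}
L = \sum_{R} \sum_{j=1}^{N_R} \Big\{ & r_{jR} \ln \lambda(R) + [2(S_{jR} - r_{jR}) - t_{jR}] \ln(1 - \lambda(R)) \\
& + (n_{jR} - 1 - 2S_{jR} + r_{jR} + t_{jR}) \ln(1 - 2P(R) + \lambda(R)P(R)) \\
& + (S_{jR} - r_{jR}) \ln P(R) - (n_{jR} - 2 - S_{jR} + t_{jR}) \ln(1 - P(R)) \Big\}
\end{aligned} \qquad \text{Eq 11}$$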
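One way to check the algebra leading to equation 11 is to evaluate, for a single sequence, both the logarithm of the shot-by-shot product of equation 6 and the corresponding summand of equation 11. The Python sketch below does this under stated assumptions: S is taken as the total number of hits and t as the first outcome plus the last outcome, and the function names, the sequence, and the values of P and λ are illustrative rather than taken from the original program.

```python
import math

def loglik_direct(x, p, lam):
    """Log of Eq 6 for one sequence: first shot, then one factor per transition."""
    hit_after_miss = p * (1.0 - lam) / (1.0 - p)       # Eq 4
    prob = {(1, 1): lam, (1, 0): 1.0 - lam,            # keys are (previous, current)
            (0, 1): hit_after_miss, (0, 0): 1.0 - hit_after_miss}
    total = math.log(p if x[0] == 1 else 1.0 - p)
    for prev, cur in zip(x, x[1:]):
        total += math.log(prob[(prev, cur)])
    return total

def loglik_closed(x, p, lam):
    """One (j, R) summand of Eq 11, written with the statistics n, r, s, t."""
    n = len(x)
    r = sum(a * b for a, b in zip(x, x[1:]))
    s = sum(x)                                         # assumed definition of S
    t = x[0] + x[-1]                                   # assumed definition of t
    return (r * math.log(lam)
            + (2 * (s - r) - t) * math.log(1.0 - lam)
            + (n - 1 - 2 * s + r + t) * math.log(1.0 - 2.0 * p + lam * p)
            + (s - r) * math.log(p)
            - (n - 2 - s + t) * math.log(1.0 - p))

shots = [1, 1, 0, 1, 1, 1, 0, 0, 1]   # made-up sequence
direct = loglik_direct(shots, 0.6, 0.7)
closed = loglik_closed(shots, 0.6, 0.7)
print(direct, closed)                 # the two values agree up to rounding
```

Summing `loglik_closed` over all gunners and ranges, with P and λ evaluated at each range, gives the quantity L of equation 11.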


Now, substituting the polynomials of equations 1 and 2 for P(R) and
$\lambda(R)$ in the log likelihood function, one has:


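$$\begin{aligned}
L = \sum_{R} \sum_{j=1}^{N_R} \Big\{ & r_{jR} \ln\Big(\sum_{k=0}^{2} a_k R^k\Big) + [2(S_{jR} - r_{jR}) - t_{jR}] \ln\Big(1 - \sum_{k=0}^{2} a_k R^k\Big) \\
& + [n_{jR} - 1 - 2S_{jR} + r_{jR} + t_{jR}] \ln\Big(1 - 2\sum_{q=0}^{2} b_q R^q + \sum_{q=0}^{2} b_q R^q \sum_{k=0}^{2} a_k R^k\Big) \\
& + (S_{jR} - r_{jR}) \ln\Big(\sum_{q=0}^{2} b_q R^q\Big) - (n_{jR} - 2 - S_{jR} + t_{jR}) \ln\Big(1 - \sum_{q=0}^{2} b_q R^q\Big) \Big\}
\end{aligned} \qquad \text{Eq 12}$$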


To find the maximum likelihood estimates of the regression coefficients,
partial derivatives of the log likelihood function with respect to the a's
and b's are required. These partials are:


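$$\begin{aligned}
\frac{\partial L}{\partial a_m} = \sum_{R} \sum_{j=1}^{N_R} \Bigg\{ & \frac{r_{jR} R^m}{\sum_{k=0}^{2} a_k R^k} - \frac{[2(S_{jR} - r_{jR}) - t_{jR}] R^m}{1 - \sum_{k=0}^{2} a_k R^k} \\
& + \frac{(n_{jR} - 1 - 2S_{jR} + r_{jR} + t_{jR}) R^m \sum_{q=0}^{2} b_q R^q}{1 - 2\sum_{q=0}^{2} b_q R^q + \sum_{q=0}^{2} b_q R^q \sum_{k=0}^{2} a_k R^k} \Bigg\}, \quad m = 0, 1, 2
\end{aligned} \qquad \text{Eq 13}$$

$$\begin{aligned}
\frac{\partial L}{\partial b_m} = \sum_{R} \sum_{j=1}^{N_R} \Bigg\{ & \frac{(n_{jR} - 1 - 2S_{jR} + r_{jR} + t_{jR}) \big(-2R^m + R^m \sum_{k=0}^{2} a_k R^k\big)}{1 - 2\sum_{q=0}^{2} b_q R^q + \sum_{q=0}^{2} b_q R^q \sum_{k=0}^{2} a_k R^k} \\
& + \frac{(S_{jR} - r_{jR}) R^m}{\sum_{q=0}^{2} b_q R^q} + \frac{(n_{jR} - 2 - S_{jR} + t_{jR}) R^m}{1 - \sum_{q=0}^{2} b_q R^q} \Bigg\}, \quad m = 0, 1, 2
\end{aligned} \qquad \text{Eq 14}$$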

These expressions are set to zero and solved for the a's and b's. It is
clear that the solutions must be obtained by iterative methods. A program
was written to do this using the Newton-Raphson method (3).
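The original program is not reproduced here. As a rough illustration of the procedure, the following Python sketch codes the log likelihood of equation 12 and the partial derivatives of equations 13 and 14, then applies a Newton-Raphson iteration to them. Several choices are assumptions of this sketch and not the author's method: the second derivatives are approximated by finite differences of the partials, a step-halving safeguard (with a steepest-ascent fallback) keeps every iterate inside the region where the logarithms are defined, S and t are taken as the hit total and the first-plus-last outcome, and the data are simulated from made-up coefficients with ranges in arbitrary units.

```python
import numpy as np

def design(R):
    """Columns 1, R, R^2 for the quadratic curves of Eq 1 and Eq 2."""
    R = np.asarray(R, dtype=float)
    return np.stack([np.ones_like(R), R, R**2], axis=1)

def loglik(theta, V, n, r, s, t):
    """Eq 12 with theta = (a0, a1, a2, b0, b1, b2); -inf if a log argument is <= 0."""
    lam, p = V @ theta[:3], V @ theta[3:]
    d = 1.0 - 2.0 * p + lam * p
    if min(lam.min(), (1 - lam).min(), p.min(), (1 - p).min(), d.min()) <= 0:
        return -np.inf
    return float(np.sum(r * np.log(lam) + (2 * (s - r) - t) * np.log(1 - lam)
                        + (n - 1 - 2 * s + r + t) * np.log(d)
                        + (s - r) * np.log(p) - (n - 2 - s + t) * np.log(1 - p)))

def score(theta, V, n, r, s, t):
    """Eq 13 (entries 0-2, the a's) followed by Eq 14 (entries 3-5, the b's)."""
    lam, p = V @ theta[:3], V @ theta[3:]
    d = 1.0 - 2.0 * p + lam * p
    c = n - 1 - 2 * s + r + t
    w_a = r / lam - (2 * (s - r) - t) / (1 - lam) + c * p / d
    w_b = c * (lam - 2.0) / d + (s - r) / p + (n - 2 - s + t) / (1 - p)
    return np.concatenate([V.T @ w_a, V.T @ w_b])

def newton_raphson(theta, args, tol=1e-6, max_iter=50, h=1e-6):
    """Newton-Raphson on the score; accepts only steps that raise the log likelihood."""
    theta = np.asarray(theta, dtype=float)
    for _ in range(max_iter):
        g = score(theta, *args)
        if np.linalg.norm(g) < tol:
            break
        H = np.empty((6, 6))                  # finite-difference Hessian (assumption)
        for i in range(6):
            e = np.zeros(6)
            e[i] = h
            H[:, i] = (score(theta + e, *args) - score(theta - e, *args)) / (2 * h)
        step = np.linalg.lstsq(H, -g, rcond=None)[0]
        if g @ step <= 0:                     # not an ascent direction: use the gradient
            step = g
        scale, current = 1.0, loglik(theta, *args)
        while scale > 1e-12 and loglik(theta + scale * step, *args) <= current:
            scale *= 0.5
        if scale <= 1e-12:                    # no improving step found
            break
        theta = theta + scale * step
    return theta

def simulate(rng, p, lam, n):
    """One stationary hit/miss chain with P(hit) = p and P(hit | previous hit) = lam."""
    hit_after_miss = p * (1.0 - lam) / (1.0 - p)
    x = [int(rng.random() < p)]
    for _ in range(n - 1):
        x.append(int(rng.random() < (lam if x[-1] else hit_after_miss)))
    return x

rng = np.random.default_rng(0)
rows = []
for R in (1.0, 2.0, 3.0, 4.0):                # made-up ranges, arbitrary units
    for _ in range(20):                       # 20 simulated gunners per range
        x = simulate(rng, 0.8 - 0.1 * R, 0.9 - 0.1 * R, 30)
        rows.append((R, len(x), sum(a * b for a, b in zip(x, x[1:])), sum(x), x[0] + x[-1]))
R, n, r, s, t = (np.array(col) for col in zip(*rows))
args = (design(R), n, r, s, t)
start = np.array([0.5, 0.0, 0.0, 0.5, 0.0, 0.0])
fit = newton_raphson(start, args)
print(fit, loglik(start, *args), loglik(fit, *args))
```

The starting point takes both curves to be the constant 0.5, which satisfies every positivity condition of equation 12. Because a step is accepted only when it increases L, the returned coefficients never have a lower log likelihood than the starting values, although convergence to the maximum is not guaranteed by this sketch.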
4. CONCLUSION. Recall that the problem discussed in the Introduction was
the problem of fitting probability-of-hit curves to data generated by
repeated missile simulations. The curves were assumed to be quadratic
functions of R expressed as follows:


$$P(R) = b_0 + b_1 R + b_2 R^2 \qquad \text{Eq 1}$$

$$P_{11}(R) = a_0 + a_1 R + a_2 R^2 \qquad \text{Eq 2}$$

Hence, utilization of the maximum likelihood technique, given by equations
11 through 14 above, and subsequent solution by the Newton-Raphson method,
provides the values of the coefficients, a's and b's, necessary to achieve
a maximum likelihood "best fit" of equations 1 and 2 to their respective
data points.